Without Remnant Movement, MGs Are Context-Free
Abstract
Minimalist grammars offer a formal perspective on a popular linguistic theory, and are comparable in weak generative capacity to other mildly context-sensitive formalisms. Minimalist grammars allow for the straightforward definition of so-called remnant movement constructions, which have found use in many linguistic analyses. It has been conjectured that the ability to generate this kind of configuration is crucial to the super-context-free expressivity of minimalist grammars. This conjecture is here proven.

In the minimalist program of [1], the well-formedness conditions on movement-type dependencies of the previous GB Theory [2] are reimplemented derivationally, so as to render ill-formed movement chains impossible to assemble. For example, the c-command restriction on adjacent chain links is enforced by making movement always target the root of the current subtree, a position c-commanding any other. One advantage of this derivational reformulation of chain well-formedness conditions is that so-called 'remnant movement' configurations, as depicted on the left in figure 1, are easy to generate.

Fig. 1. Remnant Movement (left) vs Non-Remnant Movement (right)

Remnant movement occurs when, due to previous movement operations, a moving expression does not itself have a grammatical description. Here we imagine that the objects derivable by the grammar in figure 1 include the black triangle and the complex of white and black triangles, but not the white triangle to the exclusion of the black triangle. From an incremental bottom-up perspective, the structure on the left in figure 1 first involves moving the grammatically specifiable black triangle, but then the white triangle, which is not directly grammatically describable, moves.
This is to be contrasted with the superficially similar configuration on the right in figure 1, in which, again from an incremental bottom-up perspective, both movement steps are of grammatically specifiable objects (the first step (here, the dotted line) involves movement of the complex of white and black triangles, and the second step (the solid line) involves movement of the black triangle). In particular, the dependencies generated by remnant movement are 'crossing', while those of the permissible type are nested (in the intuitive sense made evident in the figure).

The formalism of Minimalist Grammars (MGs) [3] was shown in [4] to be mildly context-sensitive (see also [5]). The MGs constructed in the proof use massive remnant movement to derive the non-context-free patterns, inviting the question as to whether this is necessary. Here we show that it is. MGs without remnant movement derive all and only the context-free languages. This result holds even when the SMC (a canonical constraint on movement, see [6]) is relaxed in such a way as to render the set of well-formed derivation trees non-regular. In this case, the standard proof [4] that MGs are mildly context-sensitive no longer goes through.

1 Mathematical Preliminaries

We assume familiarity with basic concepts of formal language theory. We write $2^A$ for the power set of a set $A$, and, for $f : A \to B$ a partial function, $\mathrm{dom}(f)$ denotes the subset of $A$ on which $f$ is defined. Given a set $\Sigma$, $\Sigma^*$ denotes the set of all finite sequences of elements from $\Sigma$, including the empty sequence $\epsilon$. $\Sigma^+$ is the set of all finite sequences over $\Sigma$ of length greater than 0. For $u, v \in \Sigma^*$, $u^\frown v$ is their concatenation. Often we will simply indicate concatenation via juxtaposition. A ranked alphabet is a set $\Sigma$ together with a function $\mathrm{rank} : \Sigma \to \mathbb{N}$ assigning to each 'function symbol' in $\Sigma$ a natural number indicating the arity of the function it denotes. If $\Sigma$ is a ranked alphabet, we write $\Sigma_i$ for the set $\{\sigma \in \Sigma : \mathrm{rank}(\sigma) = i\}$.
If $\sigma \in \Sigma_i$, we write $\sigma^{(i)}$ to indicate this fact. Let $\Sigma$ be a ranked alphabet. The set of terms over $\Sigma$ is written $T_\Sigma$, and is defined to be the smallest set containing each $\sigma \in \Sigma_0$ and, for each $\sigma \in \Sigma_n$ and $t_1, \dots, t_n \in T_\Sigma$, the term $\sigma(t_1, \dots, t_n)$. For $X$ any set and $\Sigma$ a ranked alphabet, $\Sigma \cup X$ is also a ranked alphabet, where $(\Sigma \cup X)_0 = \Sigma_0 \cup X$ and $(\Sigma \cup X)_i = \Sigma_i$ for all $i > 0$. We write $T_\Sigma(X)$ for $T_{\Sigma \cup X}$. A unary context over $\Sigma$ is a term $C \in T_\Sigma(\{x\})$ such that $x$ occurs exactly once in $C$. Given a unary context $C$ and a term $t$, we write $C[t]$ to denote the result of substituting $t$ for $x$ in $C$: $x[t] = t$ and $\sigma(t_1, \dots, t_n)[t] = \sigma(t_1[t], \dots, t_n[t])$.

A bottom-up tree automaton is given by a quadruple $\langle Q, \Sigma, \to, Q_F \rangle$, where $Q$ is a finite set of states, $Q_F \subseteq Q$ is the set of final states, $\Sigma$ is a ranked alphabet, and $\to\ \subset_{fin} (\Sigma \times Q^*) \times Q$ is a finite transition relation. A bottom-up tree automaton defines a relation $\Rightarrow\ \subseteq T_\Sigma(Q) \times T_\Sigma(Q)$: if $C$ is a unary context over $\Sigma \cup Q$, and $\langle \sigma, q_1, \dots, q_n \rangle \to q$, then $C[\sigma(q_1, \dots, q_n)] \Rightarrow C[q]$. The tree language accepted by a bottom-up tree automaton $A$ is defined as $L(A) = \{t \in T_\Sigma : \exists q \in Q_F.\ t \Rightarrow^* q\}$. A set of trees is regular iff it is the language accepted by some bottom-up tree automaton.

2 Minimalist Grammars

We use the notation of [7]. An MG over an alphabet $\Sigma$ is a triple $G = \langle Lex, sel, lic \rangle$ where $sel$ and $lic$ are finite non-empty sets (of 'selection' and 'licensing' feature types), and, for $F = \{{=}s, s : s \in sel\} \cup \{{+}l, {-}l : l \in lic\}$, $Lex \subset_{fin} \Sigma^* \times F^*$. Given binary function symbols $\Delta_2 := \{\mathrm{mrg}_1, \mathrm{mrg}_2, \mathrm{mrg}_3\}$ and unary function symbols $\Delta_1 := \{\mathrm{mv}_1, \mathrm{mv}_2\}$, a derivation is a term in $\mathrm{der}(G) = T_{\Delta_2 \cup \Delta_1 \cup Lex}$, where elements of $Lex$ are treated as nullary symbols. An expression is a finite sequence $\phi_0, \phi_1, \dots, \phi_n$ of pairs over $\Sigma^* \times F^*$; the first component $\phi_0$ represents the yield and features of the expression (qua tree) minus any moving parts, and the remaining components represent the yield and features of the moving parts of the expression.
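To make the definitions of bottom-up tree automata and derivation terms concrete, here is a minimal Python sketch. It is an illustration only, not part of the formal development: terms are encoded as nested tuples, the names `run`, `accepts`, `DELTA` and the two-item lexicon are hypothetical, and the toy automaton merely tracks whether a derivation term contains an even number of mv nodes.

```python
from itertools import product

# A term sigma(t1, ..., tn) is encoded as the tuple (sigma, t1, ..., tn);
# a nullary symbol sigma is the 1-tuple (sigma,).

def run(term, delta):
    """Return the set of states q such that term =>* q.

    delta maps (symbol, (q1, ..., qn)) to a set of states, i.e. it encodes
    the transition relation <sigma, q1, ..., qn> -> q.
    """
    symbol, *children = term
    child_states = [run(child, delta) for child in children]
    reached = set()
    # For a leaf, product() yields exactly one empty tuple.
    for qs in product(*child_states):
        reached |= delta.get((symbol, qs), set())
    return reached

def accepts(term, delta, final):
    """A term is accepted iff it can be rewritten to some final state."""
    return bool(run(term, delta) & final)

# Toy automaton over the derivation signature (assumption: the strings
# "a" and "b" stand in for lexical items). States record the parity of
# the number of mv nodes seen so far.
LEX = ["a", "b"]
FLIP = {"even": "odd", "odd": "even"}
DELTA = {}
for item in LEX:
    DELTA[(item, ())] = {"even"}
for mrg in ("mrg1", "mrg2", "mrg3"):
    for p, q in product(FLIP, repeat=2):
        DELTA[(mrg, (p, q))] = {"even" if p == q else "odd"}
for mv in ("mv1", "mv2"):
    for p in FLIP:
        DELTA[(mv, (p,))] = {FLIP[p]}

two_moves = ("mv1", ("mrg1", ("a",), ("mv2", ("b",))))
one_move = ("mv1", ("a",))
print(accepts(two_moves, DELTA, {"even"}), accepts(one_move, DELTA, {"even"}))
```

The automaton inspects only the shape of a derivation term; whether the term actually evaluates to an expression is a separate question, since Eval is partial.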
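An expression of the form $\phi_0 = \langle \sigma, \gamma \rangle$ represents a tree with no moving pieces; such an expression is called a complete expression of category $\gamma$. $\mathrm{Eval} : \mathrm{der}(G) \to 2^{(\Sigma^* \times F^*)^*}$ is a partial function mapping derivations to the sets of expressions they are derivations of. Given $l \in Lex$, $\mathrm{Eval}(l) = \{l\}$, and $\mathrm{Eval}(\mathrm{mrg}_i(d_1, d_2))$ and $\mathrm{Eval}(\mathrm{mv}_i(d))$ are defined as $\{\mathrm{merge}_i(e_1, e_2) : e_j \in \mathrm{Eval}(d_j)\}$ and $\{\mathrm{move}_i(e) : e \in \mathrm{Eval}(d)\}$ respectively, where the operations $\mathrm{merge}_i$ and $\mathrm{move}_i$ are defined below. In the following, $\sigma, \tau \in \Sigma^*$, $\gamma, \delta \in F^*$, and $\phi_i, \psi_j \in \Sigma^* \times F^*$.

$$\frac{\langle \sigma, {=}c\gamma \rangle \in Lex \qquad \langle \tau, c \rangle, \psi_1, \dots, \psi_n}{\langle \sigma^\frown\tau, \gamma \rangle, \psi_1, \dots, \psi_n}\ \text{merge1}$$

$$\frac{\langle \sigma, {=}c\gamma \rangle, \phi_1, \dots, \phi_m \qquad \langle \tau, c \rangle, \psi_1, \dots, \psi_n}{\langle \tau^\frown\sigma, \gamma \rangle, \phi_1, \dots, \phi_m, \psi_1, \dots, \psi_n}\ \text{merge2}$$

$$\frac{\langle \sigma, {=}c\gamma \rangle, \phi_1, \dots, \phi_m \qquad \langle \tau, c\delta \rangle, \psi_1, \dots, \psi_n}{\langle \sigma, \gamma \rangle, \phi_1, \dots, \phi_m, \langle \tau, \delta \rangle, \psi_1, \dots, \psi_n}\ \text{merge3}$$

$$\frac{\langle \sigma, {+}c\gamma \rangle, \phi_1, \dots, \phi_{i-1}, \langle \tau, {-}c \rangle, \phi_{i+1}, \dots, \phi_m}{\langle \tau^\frown\sigma, \gamma \rangle, \phi_1, \dots, \phi_{i-1}, \phi_{i+1}, \dots, \phi_m}\ \text{move1}$$

$$\frac{\langle \sigma, {+}c\gamma \rangle, \phi_1, \dots, \phi_{i-1}, \langle \tau, {-}c\delta \rangle, \phi_{i+1}, \dots, \phi_m}{\langle \sigma, \gamma \rangle, \phi_1, \dots, \phi_{i-1}, \langle \tau, \delta \rangle, \phi_{i+1}, \dots, \phi_m}\ \text{move2}$$

The SMC is a restriction on the domains of move1 and move2 which renders these relations functional:

(SMC) no $\phi_j = \langle \sigma_j, \gamma_j \rangle$ is such that $\gamma_j = {-}c\gamma'_j$ unless $j = i$.

The (string) language generated at a category $c$ (for $c \in sel$) by an MG $G$ is defined to be the set of yields of the complete expressions of category $c$:

$$L_c(G) := \{\sigma : \exists d \in \mathrm{der}(G).\ \langle \sigma, c \rangle \in \mathrm{Eval}(d)\}$$

Footnote 1. Implicit in [4] is the fact that for any $c$, $\mathrm{dom}_c(\mathrm{Eval}) = \{d : \exists \sigma.\ \langle \sigma, c \rangle \in \mathrm{Eval}(d)\}$ is a regular tree language. This is explicitly shown in [8].

3 A Ban on Remnant Movement

In order to implement a ban on remnant movement, we impose a temporary island status on moving expressions: nothing can move out of a moving expression until it has settled down ('please wait until the train has come to a complete stop before exiting'). Currently, an expression $e = \phi_0, \phi_1, \dots, \phi_k$ has the form just given, where $\phi_0$ is the 'head' of the expression, and the other $\phi_i$ are 'moving parts'.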
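The operations merge1 to merge3 and move1 and move2 of Section 2, together with the SMC, can be sketched in Python as follows. This is a minimal sketch under stated assumptions: an expression is encoded as a tuple of (string, feature-tuple) pairs, features are strings such as "=x", "x", "+A", "-A", the function names and the four-item lexicon are illustrative, and move1 and move2 are folded into a single function `move`, which is safe because under the SMC at most one of them is defined for a given expression.

```python
def _select(e1, e2):
    """If the head of e1 starts with =c and the head of e2 starts with c,
    return (sigma, gamma, tau, delta); otherwise None."""
    (s, f1), (t, f2) = e1[0], e2[0]
    if f1 and f2 and f1[0] == "=" + f2[0]:
        return s, f1[1:], t, f2[1:]
    return None

def merge1(e1, e2, lex):
    """Lexical selector, complement stays in place: yield sigma + tau."""
    m = _select(e1, e2)
    if m is None or m[3] or len(e1) != 1 or e1[0] not in lex:
        return None
    s, gamma, t, _ = m
    return ((s + t, gamma),) + tuple(e2[1:])

def merge2(e1, e2):
    """Selected expression stays in place as a specifier: yield tau + sigma."""
    m = _select(e1, e2)
    if m is None or m[3]:
        return None
    s, gamma, t, _ = m
    return ((t + s, gamma),) + tuple(e1[1:]) + tuple(e2[1:])

def merge3(e1, e2):
    """Selected expression has features left (c delta): it becomes a mover."""
    m = _select(e1, e2)
    if m is None or not m[3]:
        return None
    s, gamma, t, delta = m
    return ((s, gamma),) + tuple(e1[1:]) + ((t, delta),) + tuple(e2[1:])

def move(e):
    """move1 / move2, defined only if exactly one mover starts with -c (SMC)."""
    (s, f), movers = e[0], tuple(e[1:])
    if not f or not f[0].startswith("+"):
        return None
    target = "-" + f[0][1:]
    hits = [i for i, (_, g) in enumerate(movers) if g[0] == target]
    if len(hits) != 1:          # SMC violated, or nothing to move
        return None
    i = hits[0]
    t, g = movers[i]
    if len(g) == 1:             # move1: the mover lands, yield tau + sigma
        return ((t + s, f[1:]),) + movers[:i] + movers[i + 1:]
    # move2: the mover checks -c and keeps moving, in the same position
    return ((s, f[1:]),) + movers[:i] + ((t, g[1:]),) + movers[i + 1:]

# Illustrative lexicon (an assumption of this sketch).
LEX = {("a", ("=x", "x", "-A")), ("f", ("x",)),
       ("c", ("=y", "+A", "y")), ("e", ("=x", "y"))}
A, F = (("a", ("=x", "x", "-A")),), (("f", ("x",)),)
C, E = (("c", ("=y", "+A", "y")),), (("e", ("=x", "y")),)

af = merge1(A, F, LEX)
ce = merge1(C, merge3(E, af), LEX)
print(move(ce))
```

With two movers that both begin with -A, for example `(("ce", ("+A", "y")), ("a", ("-A",)), ("af", ("-A",)))`, `move` returns `None`: this is the SMC at work.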
Importantly, although we view such an expression as a compressed representation of a tree, there is no hierarchical relation among the $\phi_i$. In order to implement a ban against remnant movement, we need to indicate which of the moving parts are contained in which others. We represent this information by retaining some of the relative dominance relations in the represented tree: $e = \phi_0, T_1, \dots, T_n$, where each tree $T_i$ pairs a moving part with a (possibly empty) sequence of trees (the set of trees $T$ is the smallest set $X$ such that $X = (\Sigma^* \times F^*) \times X^*$). We interpret such a structure $T_i = \langle \phi_i, \langle T^i_1, \dots, T^i_m \rangle \rangle$ as a moving part (the features of which are represented by $\phi_i$) which itself may contain moving subparts ($T^i_1, \dots, T^i_m$). By allowing these moving subparts to become accessible for movement only after the features of $\phi_i$ have been exhausted, we rule out the crossing, remnant-movement-type dependencies.

The revised cases of the operations merge and move, PBC-merge and PBC-move, are given below. The function PBC-Eval interprets derivations $d \in \mathrm{der}(G)$ in 'PBC-mode', such that $\text{PBC-Eval}(l) = \{l\}$, $\text{PBC-Eval}(\mathrm{mv}_i(d)) = \{\text{PBC-move}_i(e) : e \in \text{PBC-Eval}(d)\}$, and $\text{PBC-Eval}(\mathrm{mrg}_i(d_1, d_2)) = \{\text{PBC-merge}_i(e_1, e_2) : e_j \in \text{PBC-Eval}(d_j)\}$. In the below, $\sigma, \tau$ are strings, $\gamma, \delta$ are finite sequences of syntactic features, and $S_i, T_j$ are trees of the form $\langle \langle \sigma, \gamma \rangle, \langle S_1, \dots, S_n \rangle \rangle$.

$$\frac{\langle \sigma, {=}c\gamma \rangle \in Lex \qquad \langle \tau, c \rangle, T_1, \dots, T_n}{\langle \sigma^\frown\tau, \gamma \rangle, T_1, \dots, T_n}\ \text{PBC-merge1}$$

$$\frac{\langle \sigma, {=}c\gamma \rangle, S_1, \dots, S_m \qquad \langle \tau, c \rangle, T_1, \dots, T_n}{\langle \tau^\frown\sigma, \gamma \rangle, S_1, \dots, S_m, T_1, \dots, T_n}\ \text{PBC-merge2}$$

$$\frac{\langle \sigma, {=}c\gamma \rangle, S_1, \dots, S_m \qquad \langle \tau, c\delta \rangle, T_1, \dots, T_n}{\langle \sigma, \gamma \rangle, S_1, \dots, S_m, \langle \langle \tau, \delta \rangle, \langle T_1, \dots, T_n \rangle \rangle}\ \text{PBC-merge3}$$

$$\frac{\langle \sigma, {+}c\gamma \rangle, S_1, \dots, S_{i-1}, \langle \langle \tau, {-}c \rangle, \langle T_1, \dots, T_n \rangle \rangle, S_{i+1}, \dots, S_m}{\langle \tau^\frown\sigma, \gamma \rangle, S_1, \dots, S_{i-1}, T_1, \dots, T_n, S_{i+1}, \dots, S_m}\ \text{PBC-move1}$$

$$\frac{\langle \sigma, {+}c\gamma \rangle, S_1, \dots, S_{i-1}, \langle \langle \tau, {-}c\delta \rangle, \langle T_1, \dots, T_n \rangle \rangle, S_{i+1}, \dots, S_m}{\langle \sigma, \gamma \rangle, S_1, \dots, S_{i-1}, \langle \langle \tau, \delta \rangle, \langle T_1, \dots, T_n \rangle \rangle, S_{i+1}, \dots, S_m}\ \text{PBC-move2}$$

Footnote 2. The 'PBC' is named after the proper binding condition of [9], which filters out surface structures in which a trace linearly precedes its antecedent. If the antecedent of a trace left behind by a particular movement step is defined to be the element (trace or otherwise) in the target position of that movement, the present modification to the rules merge and move exactly implements the PBC in the minimalist grammar framework.

We will continue to require that these rules satisfy (a version of) the SMC. Following [11], we define the SMC over PBC-move as follows:

(PBC-SMC) no $T_j = \langle \langle \sigma_j, \gamma_j \rangle, \langle T^j_1, \dots, T^j_n \rangle \rangle$ is such that $\gamma_j = {-}c\gamma'_j$ unless $j = i$.

Footnote 3. There are two natural interpretations of the SMC on expressions $e = \phi_0, T_1, \dots, T_n$. First, one might require that no two $\phi_i$ and $\phi_j$ share the same first feature, regardless of how deeply embedded within trees they may be. This perspective views the tree structure as irrelevant for the statement of the SMC. Another reasonable option is to require only that no two $\phi_i$ and $\phi_j$ share the same first feature, where $\phi_i$ and $\phi_j$ are the roots of trees $T_i$ and $T_j$ respectively. This perspective views the tree structure of moving parts as relevant to the SMC, and allows for a kind of 'smuggling' [10], as described in [11]. The results of this paper are independent of which of these two interpretations of the SMC we adopt. We adopt the second, because it is more interesting (the derivation tree sets no longer constitute regular tree languages).

The string language generated in PBC-mode at a category $c$ is defined as usual:

$$L^{PBC}_c(G) := \{\sigma : \exists d \in \mathrm{der}(G).\ \langle \sigma, c \rangle \in \text{PBC-Eval}(d)\}$$

Observe that the rule PBC-merge3 introduces new tree structure, temporarily freezing the moving pieces within its second argument. The rules PBC-move1 and PBC-move2 enforce that only the root of a tree is accessible to movement operations, and that its daughter subtrees become accessible to movement only once the root has finished moving.

Note also that the set of well-formed derivation trees in PBC-mode (the set dom(PBC-Eval)) is not a regular tree language (this is due to the laxness of the PBC-SMC). To see this, consider the MG $G_1 = \langle Lex, \{x, y\}, \{A\} \rangle$, where $Lex$ contains the four lexical items below.

a :: =x x -A
f :: x
c :: =y +A y
e :: =x y

Derivations of complete expressions of category y begin with f, and repeatedly merge tokens of a. Then e is merged, and for each a, a c is merged, and a move step occurs. In particular, although the yields of these trees form the context-free language ceaf, the number of mrg3 nodes must be equal to the number of mv1 nodes. It is straightforward to show that no finite-state tree automaton can enforce this invariant.

Our main result is that minimalist grammars under the PBC mode of derivation (i.e. using the rules just given above) generate exactly the class of context-free languages.

4 MGs with Hypotheses

Because the elimination of remnant movement guarantees that, viewed from a bottom-up perspective, we will finish moving a containing expression before we need to deal with any of its subparts, we can re-represent expressions using 'slash-features', as familiar from GPSG [12]. Accordingly, we replace (PBC-)merge3, which introduces a to-be-moved expression, with a new (non-functional) operation, assume, which introduces a 'slash-feature', or hypothesis. A hypothesis takes the form of a pair of feature strings $\langle \delta, \gamma \rangle$. The interpretation of a hypothesis $\langle \delta, \gamma \rangle$ is such that $\delta$ records the originally postulated 'missing' feature sequence (and thus is unchanging over the lifetime of the hypothesis), whereas $\gamma$ represents the remaining features of the hypothesis, which are checked off as the derivation progresses. Move1, which re-integrates a moving part into the main expression, is replaced with another new operation, discharge.
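Returning briefly to the PBC-mode operations of Section 3, the following Python sketch shows how the tree structure on moving parts can be encoded and replays a PBC-mode derivation of the string afcace over the lexicon of G1. It is a sketch under stated assumptions: an expression is a tuple whose first element is the head pair and whose remaining elements are trees of the form (pair, tuple of daughter trees); only PBC-merge1, PBC-merge3 and PBC-move1/PBC-move2 are included (PBC-merge2 is analogous); and all function names are illustrative.

```python
LEX = {("a", ("=x", "x", "-A")), ("f", ("x",)),
       ("c", ("=y", "+A", "y")), ("e", ("=x", "y"))}

def _select(e1, e2):
    """Return (sigma, gamma, tau, delta) if e1's head selects e2's head."""
    (s, f1), (t, f2) = e1[0], e2[0]
    if f1 and f2 and f1[0] == "=" + f2[0]:
        return s, f1[1:], t, f2[1:]
    return None

def pbc_merge1(e1, e2):
    """Lexical selector; the movers of e2 stay where they are."""
    m = _select(e1, e2)
    if m is None or m[3] or len(e1) != 1 or e1[0] not in LEX:
        return None
    s, gamma, t, _ = m
    return ((s + t, gamma),) + tuple(e2[1:])

def pbc_merge3(e1, e2):
    """The selected expression becomes a mover; its own movers are frozen
    beneath it as daughters of a new tree."""
    m = _select(e1, e2)
    if m is None or not m[3]:
        return None
    s, gamma, t, delta = m
    frozen = ((t, delta), tuple(e2[1:]))
    return ((s, gamma),) + tuple(e1[1:]) + (frozen,)

def pbc_move(e):
    """PBC-move1 / PBC-move2: only roots of trees are visible (PBC-SMC)."""
    (s, f), trees = e[0], tuple(e[1:])
    if not f or not f[0].startswith("+"):
        return None
    target = "-" + f[0][1:]
    hits = [i for i, ((_, g), _) in enumerate(trees) if g[0] == target]
    if len(hits) != 1:
        return None
    i = hits[0]
    (t, g), daughters = trees[i]
    if len(g) == 1:   # PBC-move1: the root lands and releases its daughters
        return ((t + s, f[1:]),) + trees[:i] + daughters + trees[i + 1:]
    # PBC-move2: the root keeps moving; its daughters stay frozen
    moved = ((t, g[1:]), daughters)
    return ((s, f[1:]),) + trees[:i] + (moved,) + trees[i + 1:]

A, F = (("a", ("=x", "x", "-A")),), (("f", ("x",)),)
C, E = (("c", ("=y", "+A", "y")),), (("e", ("=x", "y")),)

af = pbc_merge1(A, F)              # head "af" with features x -A
a_af = pbc_merge3(A, af)           # "af" becomes a mover under "a"
e_a = pbc_merge3(E, a_af)          # "a" becomes a mover, "af" frozen below it
step1 = pbc_move(pbc_merge1(C, e_a))     # "a" lands, releasing "af"
step2 = pbc_move(pbc_merge1(C, step1))   # "af" lands
print(step2)
```

The derivation uses two mrg3 steps and two mv1 steps, in line with the counting invariant noted for G1 in Section 3.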
Discharge replaces 'used up' hypotheses with the expressions that they 'could have been'. These expressions may themselves contain hypothesized moving pieces. Derivations of minimalist grammars with hypothetical reasoning in this sense are terms $d \in \text{Hyp-der}(G)$ over the signature $\{\mathrm{mrg}_1^{(2)}, \mathrm{mrg}_2^{(2)}, \mathrm{assm}^{(1)}, \mathrm{mv}_2^{(1)}, \mathrm{dschrg}^{(2)}\} \cup Lex$, and Hyp-Eval partially maps such terms to expressions in the by now familiar manner. In the below, $\sigma, \tau$ are strings over $\Sigma$, $\gamma, \delta, \zeta$ are finite sequences of syntactic features, and $\phi_i, \psi_j$ are pairs of the form $\langle \delta, \gamma \rangle$.

$$\frac{\langle \sigma, {=}c\gamma \rangle \in Lex \qquad \langle \tau, c \rangle, \psi_1, \dots, \psi_n}{\langle \sigma^\frown\tau, \gamma \rangle, \psi_1, \dots, \psi_n}\ \text{merge1}$$

$$\frac{\langle \sigma, {=}c\gamma \rangle, \phi_1, \dots, \phi_m \qquad \langle \tau, c \rangle, \psi_1, \dots, \psi_n}{\langle \tau^\frown\sigma, \gamma \rangle, \phi_1, \dots, \phi_m, \psi_1, \dots, \psi_n}\ \text{merge2}$$

$$\frac{\langle \sigma, {=}c\gamma \rangle, \phi_1, \dots, \phi_m}{\langle \sigma, \gamma \rangle, \phi_1, \dots, \phi_m, \langle c\delta, \delta \rangle}\ \text{assume}$$

$$\frac{\langle \sigma, {+}c\gamma \rangle, \phi_1, \dots, \phi_{i-1}, \langle \delta, {-}c \rangle, \phi_{i+1}, \dots, \phi_m \qquad \langle \tau, \delta \rangle, \psi_1, \dots, \psi_n}{\langle \tau^\frown\sigma, \gamma \rangle, \phi_1, \dots, \phi_{i-1}, \psi_1, \dots, \psi_n, \phi_{i+1}, \dots, \phi_m}\ \text{discharge}$$

$$\frac{\langle \sigma, {+}c\gamma \rangle, \phi_1, \dots, \phi_{i-1}, \langle \zeta, {-}c\delta \rangle, \phi_{i+1}, \dots, \phi_m}{\langle \sigma, \gamma \rangle, \phi_1, \dots, \phi_{i-1}, \langle \zeta, \delta \rangle, \phi_{i+1}, \dots, \phi_m}\ \text{move2}$$

We subject the operations move2 and discharge to a version of the SMC:

(Hyp-SMC) no $\phi_j = \langle \zeta_j, \gamma_j \rangle$ is such that $\gamma_j = {-}c\gamma'_j$ unless $j = i$.

The language of a minimalist grammar $G$ at category $c$ using hypothetical reasoning is defined to be:

$$L^{Hyp}_c(G) := \{\sigma : \exists d \in \text{Hyp-der}(G).\ \langle \sigma, c \rangle \in \text{Hyp-Eval}(d)\}$$

The operation discharge constrains the kinds of assumptions introduced by assume which can be part of a well-formed derivation to be those which are of the form $\langle c\delta, \delta \rangle$, where there is some lexical item $\langle \sigma, \gamma c\delta \rangle$. As there are finitely many lexical items, there are thus only finitely many useful assumptions given a particular lexicon. It will be implicitly assumed in the remainder of this paper that assume is restricted so as to generate only useful assumptions. We henceforth index assm nodes with the features of the hypotheses introduced (writing thus $\mathrm{assm}_{c\gamma}$ for an assume operation introducing the hypothesis $\langle c\gamma, \gamma \rangle$).

Theorem 1. For any $G$, and any $c \in sel_G$, the set $\mathrm{dom}_c(\text{Hyp-Eval}) = \{d : \exists \sigma.\ \langle \sigma, c \rangle \in \text{Hyp-Eval}(d)\}$ is a regular tree language.

Proof. Construct a nondeterministic bottom-up tree automaton whose states are $(|lic| + 1)$-tuples of pairs of suffixes of lexical feature sequences. The Hyp-SMC allows us to devote each component of such a tuple beyond the first to the (if it exists, unique) hypothesis beginning with a particular $-c$ feature, and thus we assume to be given a fixed enumeration of $lic$. The remarks above guarantee that only a finite number of such states is needed. Given an expression $\phi_0, \phi_1, \dots, \phi_n$, the state representing it has as its $i$-th component the pair $\langle \epsilon, \epsilon \rangle$ if there is no $\phi_j$ beginning with the $i$-th $-c$ feature, and the unique $\phi_j$ beginning with the $i$-th $-c$ feature otherwise. The 0th component of a state is always of the form $\langle \epsilon, \gamma \rangle$, where $\gamma$ is the feature sequence of $\phi_0$. As we are interested in derivations of complete expressions of category $c$, the final state is $\langle \langle \epsilon, c \rangle, \langle \epsilon, \epsilon \rangle, \dots, \langle \epsilon, \epsilon \rangle \rangle$. The transitions of the automaton are defined so as to preserve this invariant: at a lexical item $l = \langle \sigma, \gamma \rangle$, the automaton enters the state $\langle \langle \epsilon, \gamma \rangle, \langle \epsilon, \epsilon \rangle, \dots, \langle \epsilon, \epsilon \rangle \rangle$, and at an internal node $\sigma(q_1, \dots, q_n)$, the automaton enters the state $q$ just in case there are expressions $e_1, \dots, e_n$ represented by states $q_1, \dots, q_n$ which are mapped by the operation denoted by $\sigma$ to an expression $e$ represented by state $q$.

We use the facts that linear homomorphisms preserve recognizability and that the yield of a recognizable set of trees is context-free [13], in conjunction with Theorem 1, to show that minimalist grammars using hypothetical reasoning define exactly the context-free languages.

Theorem 2. For any $G$, and any $c \in sel_G$, $L^{Hyp}_c(G)$ is context-free.

Proof. Let $G$ and $c$ be given. By Theorem 1, $D = \mathrm{dom}_c(\text{Hyp-Eval})$ is recognizable. Let $E = f[D]$, where $f$ is the homomorphism defined as follows ($f$ maps nullary symbols to themselves):

$$f(\sigma(e_1, \dots, e_n)) = \begin{cases} \sigma(f(e_2), f(e_1)) & \text{if } \sigma \in \{\mathrm{mrg}_2, \mathrm{dschrg}\} \\ \sigma(f(e_1), \dots, f(e_n)) & \text{otherwise} \end{cases}$$

Inspection of $f$ reveals that it merely puts sister subtrees in the order in which they are pronounced (à la Hyp-Eval), and thus, for any $d \in D$, $\text{Hyp-Eval}(d)$ contains $\langle \sigma, c \rangle$ iff $\mathrm{yield}(f(d)) = \sigma$. As $f$ is linear, $E$ is recognizable, and thus $\mathrm{yield}(E) = L^{Hyp}_c(G)$ is context-free.

5 Relating the PBC to Hypothetical Reasoning

To show that minimalist grammars in PBC mode are equivalent to minimalist grammars with hypothetical reasoning, we will exhibit an Eval-preserving bijection between complete derivation trees of both formalisms.

Footnote 4. A complete derivation tree is just one which is the derivation of a complete expression. In the following, I use the term in conjunction with derivations in der(G) to refer exclusively to expressions derived in PBC-mode.

The gist of the transformation is best provided via an example. Consider the trees in figure 2, which are derivations over the MG $G_1$ in PBC mode and using hypothetical reasoning, respectively, of the string afcace.
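Since figure 2 is not reproduced here, the hypothetical-reasoning derivation of afcace can be replayed with a small Python sketch. Everything in it is an assumption made for illustration: derivation terms are nested tuples, lexical leaves are written ("lex", name), assm nodes carry their index cδ explicitly, and the names `hyp_eval`, `f` and `yield_` are hypothetical. Only merge1, assume and discharge are implemented, since the example needs neither merge2 nor move2. The sketch also checks the observation used in the proof of Theorem 2, namely that the yield of f(d) is the derived string.

```python
LEX = {"a": ("a", ("=x", "x", "-A")), "f": ("f", ("x",)),
       "c": ("c", ("=y", "+A", "y")), "e": ("e", ("=x", "y"))}

def merge1(e1, e2):
    """Lexical selector e1 combines with e2: yield sigma + tau."""
    (s, f1), (t, f2) = e1[0], e2[0]
    if (len(e1) == 1 and e1[0] in LEX.values()
            and f1 and f1[0] == "=" + f2[0] and len(f2) == 1):
        return ((s + t, f1[1:]),) + tuple(e2[1:])
    return None

def assume(e, cdelta):
    """Check =c against a hypothesis <c delta, delta> instead of a mover."""
    s, f1 = e[0]
    if f1 and len(cdelta) > 1 and f1[0] == "=" + cdelta[0]:
        return ((s, f1[1:]),) + tuple(e[1:]) + ((cdelta, cdelta[1:]),)
    return None

def discharge(e1, e2):
    """Replace the used-up hypothesis <delta, -c> by e2 = <tau, delta>, ..."""
    (s, f1), hyps = e1[0], tuple(e1[1:])
    t, f2 = e2[0]
    if not f1 or not f1[0].startswith("+"):
        return None
    target = "-" + f1[0][1:]
    hits = [i for i, (_, rest) in enumerate(hyps) if rest[0] == target]
    if len(hits) != 1:                       # Hyp-SMC
        return None
    i = hits[0]
    original, rest = hyps[i]
    if rest != (target,) or f2 != original:  # must be used up and match e2
        return None
    return ((t + s, f1[1:]),) + hyps[:i] + tuple(e2[1:]) + hyps[i + 1:]

def hyp_eval(d):
    """Evaluate a derivation term; None if some step is undefined."""
    op = d[0]
    if op == "lex":
        return (LEX[d[1]],)
    if op == "assm":
        e = hyp_eval(d[2])
        return None if e is None else assume(e, d[1])
    e1, e2 = hyp_eval(d[1]), hyp_eval(d[2])
    if e1 is None or e2 is None:
        return None
    return merge1(e1, e2) if op == "mrg1" else discharge(e1, e2)

def f(d):
    """Swap the daughters of mrg2 and dschrg nodes (pronunciation order)."""
    op = d[0]
    if op == "lex":
        return d
    if op == "assm":
        return (op, d[1], f(d[2]))
    d1, d2 = f(d[1]), f(d[2])
    return (op, d2, d1) if op in ("mrg2", "dschrg") else (op, d1, d2)

def yield_(d):
    """Concatenate the strings of the lexical leaves from left to right."""
    op = d[0]
    if op == "lex":
        return LEX[d[1]][0]
    if op == "assm":
        return yield_(d[2])
    return yield_(d[1]) + yield_(d[2])

XA = ("x", "-A")
A, F, C, E = ("lex", "a"), ("lex", "f"), ("lex", "c"), ("lex", "e")
inner = ("dschrg", ("mrg1", C, ("assm", XA, E)), ("assm", XA, A))
D = ("dschrg", ("mrg1", C, inner), ("mrg1", A, F))
print(hyp_eval(D), yield_(f(D)))
```

In this encoding the two hypotheses are each introduced by an assm node and removed by a dschrg node, mirroring the pairing of mrg3 and mv1 nodes in the PBC-mode derivation.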